Automatic Disambiguation of Homographic Heterophone Pairs Containing Open and Closed Mid Vowels

Authors

Christopher Shulby

Gustavo Mendonça

Vanessa Marquiafável

Abstract

The openness of mid vowels in Brazilian Portuguese has not yet been satisfactorily explored in the automatic classification of homographic heterophones (HH). We therefore aimed to develop and test a pilot classifier that assists in the automatic disambiguation of HH. For this purpose, we analyzed a set of 226 HH word pairs with unique grammatical classes, distinguished by alternating mid vowels [e, E] and [o, O]. The results showed that the rules proposed herein solve most disambiguation problems for HH word pairs containing mid vowels in the corpus analyzed and can be applied to TTS and ASR applications. The data also revealed a predominant trend toward non-verb classes, reaching 95% of occurrences for some word pairs.


Similar resources

Acoustic phonetic properties of mid vowels in New Caledonian French

This paper investigates production of the mid vowels /e, ɛ, ø, œ, o, ɔ/ by four speakers of New Caledonian French (NCF). Formant and durational properties of these vowels are examined with respect to the type of syllable in which they occur. Results point to general adherence to the loi de position in NCF, such that the close-mid vowels occur in open syllables and the open-mid vowels occur in c...


Desambiguação de Homógrafos-Heterófonos por Aprendizado de Máquina em Português Brasileiro (A Machine Learning Approach for Homographic Heterophone Disambiguation in Brazilian Portuguese)

To improve the quality of the speech produced by a text-to-speech system, it is important to obtain the maximum amount of information from the input text that may help in this task. In this context, word sense disambiguation plays an important role and remains a central problem for natural language processing applications. This paper proposes to model the ambiguity of words as a supervised...


The Automatically Built up Homograph Dictionary a Component of a Dynamic Lexical System

Ambiguous word forms (often called "homonyms" or, in written language, "homographs") are known as obstacles in many fields of computational linguistics, especially in automatic documentation, content analysis or mechanical translation. In this respect two problems must be distinguished: 1) the detection of homographic word forms in the text, 2) their disambiguation by analysis procedures. This ...


BuzzSaw at SemEval-2017 Task 7: Global vs. Local Context for Interpreting and Locating Homographic English Puns with Sense Embeddings

This paper describes our system participating in SemEval-2017 Task 7, for the subtasks of homographic pun location and homographic pun interpretation. For pun interpretation, we use a knowledge-based Word Sense Disambiguation (WSD) method based on sense embeddings. Pun-based jokes can be divided into two parts, each containing information about the two distinct senses of the pun. To exploit t...


Investigating the Role of Homograph Context Types in Determining Similarity Between Documents

Aim: Automatic information retrieval is based on the assumption that texts contain content or structural elements that can be used in word sense disambiguation and thereby improve the effectiveness of the results retrieved. Homographs are among the words requiring sense disambiguation. Depending on their roles and positions in texts, homograph contexts can be divided into different types, wit...



Publication date: 2013
